In How Many Ways Can You Reassemble Several Russian Dolls? 



Doron ZEILBERGER^ 

Dedicated to my favorite identical twins: Thotsaporn and Thotsaphon THANATIPANONDA 
Preface 

In the Fall of 1981, I had the pleasure and honor of living next-door to the eminent combinatorialist 
Joel Spencer, who was then a visiting professor at the Weizmann Institute of Science in Israel. Joel 
was always glad to talk to me, provided that I spoke in Hebrew. Joel is also a great punner, and 
was very proud when he made his first pun in Hebrew, during a volleyball game with other math 
faculty and students, when he said "kadur shell" that could mean "my ball" but also "Lee [Segal] 's 
ball". One day I got as a present a set of "Russian Dolls" (a.k.a. as Matryoshkas or Babushkas), 
which is a nested set of dolls. Trying to test Joel's knowledge of enumeration (after all he is an 
expert in "Hungarian", rather than enumerative, combinatorics), I asked him: 

In how many ways can one reassemble an n-nested Russian Doll?, 

and he immediately replied: This is a Stirling question, it rings a Bell. 

Several Russian Dolls 

Almost thirty years later, my brilliant student, Thotsaporn "Aek" Thanatipanonda asked me what 
happens if you have several, say r, identical Russian Dolls. Thanks to Sloane, MathSciNet, and 
Google scholar, we quickly found out that the case r = 2 goes back to Comtet[C] and is featured in 
Sloane's A020554. See also [Ba] and [R], and for an insightful species treatment see [L] and [P]. 
The case r = 3 was nicely handled by Ed Bender [Be], while the general case was given its coup de 
grace by John Devitt and David Jackson [D J], who used a very ingenious generatingfunctionology 
approach. 

As Herb Wilf[Wl] famously said, an "answer" to an enumeration question is an efficient algorithm 
to generate many terms in the enumerating sequence. Explicit formulas are just one such way, 
and often not the most efficient one! In this article, I will describe a "calculus" approach, using 
differential operators, that has the following advantages. 

1. It is somewhat faster than using the Devitt- Jackson [D J] complicated exponential generating 
function. 
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2. It is so flexible that it can handle non-identical Russian Dolls. Given ai single-births, 02 pairs 

of (identical) twins, 03 sets of (identical) triplets, 04 sets of (identical) quadruplets, ... , Ofc 
sets of (identical) fc-tuplets, (and you can't tell identical twins etc. apart form each other), 
in how many ways can you partition them into (not necessarily distinct) sets? This is useful if 
the school principal has to assign them into (non-ordered) classes such that no identical-looking 
children would be in the same class (or else their teacher won't be able to tell them apart, and the 
children can play tricks on her). 

3. It can be used in conjunction with Wilf's [Wi2] celebrated methodology for random selection 
of combinatorial objects to design quick algorithms for selecting, uniformly at random, a mutli-set 
set-partition of any given multiset. 

4. It beautifully illustrates MacMahon's lovely method of "differential operators" that transcribes 
combinatorial operations into differential operators. MacMahon was very fond of it, and he nicely 
described it in the entry Combinatorial Analysis of the eleventh edition of Encyclopeadia Britannica 
[M]. 

5. It beautifully illustrates my favorite methodology of rigorous experimental mathematics. You 
teach the computer how to do the combinatorics, it derives, all by itself, the (symbolic) differential 
operator (that for r > 3 would be too complicated to derive by hand, let-alone apply, even for 
MacMahon) , and then the computer goes on and uses it to crank-out as many terms as desired in 
the enumerating sequence. 

6. It beautifully illustrates the notion of catalytic variables. These are variables corresponding 
to quantities that we may not care about, but are nevertheless needed in order to facilitate the 
enumeration. At the end of the day, we set them all equal to 1. 

7. Last but not least, it enables me to contribute six new sequences to Sloane, the (beginnings 
of) the enumerating sequence for the number of ways of reassembling r identical n-nested Russian 
Dolls for 3 < r < 8. So far, only r = 1 (the Bell numbers, AOOOllO) and r = 2 (Comtet's sequence 
A020554) are present there. 

The Evolution Differential operator 

A Baby Example: r = 1 

For the sake of pedagogy, let's first treat the classical case of one Russian Doll. 

Suppose that you have a set partition of {1, 2, . . . , n — 1} with k sets. How can we accommodate a 
new-comer n? Either she is shy (or anti-social), and decides to form her own new set {n}, or she is 
outgoing and would like to join an existing set, for which she has k choices. If we call the "number 
of sets" the "state", then in the former case the new comer, n, caused the set-partition to move to 
state k + 1, while in each of the k latter cases, it stayed in state k. If we give state k the weight z^, 
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then each and every set-partition of state k (with weight z'') gives rise to the "evolution" 

_^ ^k+i j^^k _ 

In other words ^ 

z'' ^ (z + z—)z'' . 
dz 

This is true for each and every monomial z'^, and by linearity for each polynomial. So if Pn{z) is 
the sum of the weights of all set-partitions of {1, . . . ,n}, then we have the differential-recurrence 
equation: 

Pn{z)=ViPn-l{z) , 

where Vi is the differential operator 

D,f{z):={z + z^)f{z) . 

The initial condition is Pq{z) = 1. If we are only interested in the total number of set partitions of 
{1, 2, . . . , n}, then at the end of the day we plug-in z = 1, getting 

Bn = Pn{l) . 

This gives a quick way to crank-out a table of the first one thousand (or whatever) Bell numbers, 
that is memory efficient. Once we are at day n, and know P„(z), we can let the computer forget 
about Pn-i{z) (and even about B^-i, once it is printed out). Of course, this is not more efficient 
than using 



or 




but (probably) not less efficient, and besides we only did it as a warm-up. 
A Toddler Example: r = 2 

Suppose that we have n — 1 pairs of identical twins, already arranged into sets. The only restriction 
is that no two identical twins can be in the same set, but sets can be repeated. So we have a 
multiset of sets, whose union, as a multiset, is 1^2^ ... (n — 1)^. 

Of course, no set can be repeated more than twice (why?). Let there be a sets, Ai,. . . ,Aa that 
show up once, and let there be b sets Bi, . . . ,3^ that show up twice. We say that the state of this 
arrangement is (a, 6), and its weight is zfz2- 

The school principal has to place the newly-arrived pair of twins n and n, but he can't put them 
in the same set. There are seven cases. 
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Case 1: Create two new singleton sets {n}^. There is only one way of doing it, and the new state 
is (a, b + 1), with weight z^z\'^^. The corresponding operation on monomials is 

f{zi,Z2) ^ Z2f{zi,Z2) . 

Case 2: Create one new singleton set {n}, and place the other n into one of the existing A[s. 
There are a ways of doing it, and the new state is (a + 1, 6), with weight z'^'^^z^,- The corresponding 
operation on monomials is 

f{zi,Z2) Zl {^1-^^ fizi,Z2) . 

Case 3: Create one new singleton set {n}, and place the other n into one of the existing B'-s. 
There are b ways of doing it, and the new state is (a + 3, 6 — 1), since one of the S's became two 
As, with the help of n. The weight of the new state is z1~^^Z2~^ ■ The corresponding operation on 
monomials is 

f{Zi,Z2) zfz'^ (^^d^) -^^^l'^^) • 

Case 4: Place the two twins n and n into two (different, of course) ^j's. There are (2) ways 
of doing it, and the new state remains (a, 6), with weight ZiZ2- The corresponding operation on 
monomials is ^ 

f{zi,Z2)^^-(^zf^^nZi,Z2) . 

Case 5: Place one of the new twins (n or n) into one of the ^i's, and the other one into one of the 
Bi^s. There are ab ways of doing it, and the new state is (a + 2, 6 — 1), since one of the doubletons 
B's, let's call it B^, was lost, and it became the two distinct sets B^ and B^ U {n}. The new weight 
is z1+'zt'. The corresponding operation on monomials is 

f{zi,Z2) zfz^^ (^^z!") (^^i) -^(^^'^^^ ■ 

Case 6: Place the twins (n and n) into two different Sj's. There are (2) ways of doing it, and the 
new state is (a + 4, 6 — 2), since two of the doubletons S's, let's call them Bi and Bj, were lost, 
and they became the four distinct new sets B^, Bi U {n}, Bj and Bj U {n}. The new weight is 

4. 



2 rjij^g corresponding operation on monomials is 

fizi,Z2) 4Z2^\ (^2^ ) fizi,Z2) 



Case 7: Place both of the new twins (n and n) into each of the two copies of the same B^. There are 
6 ways of doing it, and the new state is the same, (a, 6), since single sets stay single and doubletons 
stay doubletons. The new weight is ZiZ2- The corresponding operation on monomials is 

f{zi,Z2) (^2^ j f{zl,Z2) . 
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Combing, we have just proved: 

(2) 

Fact: Let Pn (-21,^2) be the sum of the weights of all multiset set-partitions of the multiset 
{1^ . . . n^}, with the weight being zi to the power the number of sets that show up once times Z2 
to the power the number of sets that show up twice. Let V2 be the partial-differential operator 
(where D, := ^, D, := 

V2 := Z2D2 + 1/2 zi^D2^ + zi^DiD2 + 1/2 zi^Di^ + zi^Da + zi^D^ + Z2 , 

then 

The Comtet numbers are Pn^\l, 1). 
The general case 

Suppose that we already have a multiset set-partition of {l*" . . . {n — lY}, with ai sets that show-up 
once, 02 sets that show-up twice, . . ., sets that show-up r times. The state of this particular 
multiset set-partion is (ai, 02, . . . , a^), and its weight is z"^ . . . z'^'" . 

We have to place the r identical new comers rf . 

We must make the following decisions 

1. How many of them would start their own singleton sets, say, Cq 

2. For the remaining r — co new members n, for z = 1, 2, . . . , r, how many of them, let's call it Cj, 
would be placed in sets that show up i times. 

After these decisions we have a vector of non-negative integers [cq, ci, . . . , Cr], such that cq -|- ci -|- 
... + Cr = r. 

Once we decided that Cj of the new n's would go to sets that show up i times, we have to decide, 
amongst those siblings which sets should be asked to invite them. These sets can be all different, 
but they could all be different copies of the same set (if Ci <i). This naturally leads to an integer 
partition Aj = I'"i2'"2 3'"3 . . . ^'"i (written in multiplicity notation). 

We have to place rai of these siblings such that each of them goes to different sets. These make mi 
of the formerly a^-repeated sets become (oj — l)-repeated sets and creates mi new singleton sets. 
So ai — > ai -|- mi, and aj_i — > a^-i -|- mi and — > — mi. In terms of differential operators it is 

f{zi,...,Zr) {l/mi\){ziZi-iDi)"^^ f{zi, . . . , Zr) . 

Similarly for 2"*^ we have 

f{zi,...,Zr) {l/m2\){z2Zi-2Di)""' f{zi, . . . , Zr) , 
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and so on. 

So every possible scenario of placing the new identical r siblings of the n family corresponds to an 
r+1 tuple: 

T = [co, Xi, . . . , Xr] , 

where cq is an integer, Ai, A2, . . . , are integer partitions such that the largest part of Aj is < i, 
and 

Co + |Ai| + IA2I + . . . + |Ar| = r . 
For each such scenario corresponds the "monomial" operator 

r 

V[T]:=z,,llQ[Xi] , 

i=l 

where, writing Aj = I™i2™'2 3m3 __i">-i (mi, 777.2 are now local variables, i.e. they are different, of 
course, for each Aj), 

j=i J 

Here £)j := and '■= 1- 

Finally, we can write down the evolution operator : 

T scenario 

and we have the 

Theorem: Let Pn\z\,Z2-, - ■ ■ ,Zr) be the sum of the weights of all multiset set-partions of the 
multiset {V . . .rf}, with the weight being zi to the power the number of sets that show up once 
times Z2 to the power the number of sets that show up twice times . . . times z^ to the power the 
number of sets that show up r times, then 

Pn\zi^ 2^2, • • • , Zr) = P^P^I^ (2:1 , ^2, . . . , Zr) . 

The number of such multi-set set partitions is of course Pji''' (1,1,...,!). 
Non-Identical Russian Dolls 

The operators Vr can be combined to yield the 

Main Theorem: The number of ways of partitioning into sets a multiset consisting of mi ele- 
ments that appear once, 7772 elements that appear twice, . . ., rrir elements that appear r times, or 
equivalently, the multiset 

1 . . . 7771 (rr7i + 1)^ . . . {mi + 7712)^ . . . (ttii -|- . . . -|- 777^-1 -1-1)'"... {mi -|- . . . -|- Tn^-i + 
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is computed as follows. First compute the polynomial in zi, . . . ,Zr- 

P{z^,Z2,...,Zr)=(jlVryi) , 

and then plug-in zi = 1, . . . , Zr = 1. 
Random Generation 

Going back to the Stirling-Bell case, r = 1, the "differential recurrence" Pn{z) = {z + z-^)Pn-i{z) 
is equivalent to the famous recurrence 

S{n,k) = S{n-l,k-l) + kS{n-l,k) . 

This can be used, according to Wilf[W2], to generate uniformly at random, a set-partion with k 
sets as follows. 

First pre-compute a table of S{n, k) using the recurrence. Now roll a loaded coin with probability 
of Heads being S{n — l,k — l)/5'(n, k) and probability of Tails being kS{n — \, k)/S{n, k). If it lands 
Heads, recursively generate a random set-partitions of {1,2,... ,n — 1} with k — 1 sets, and adjoin 
the singleton {n} to it, otherwise generate recursively a random set-partition of {1, 2, . . . , n — 1} 
with k sets, and then roll a fair fc-sided die, and accordingly decide which of the k members of the 
set-partition should invite n to join it. 

If we want a (uniformly) random set partition, then decide on the number of sets /c, by rolling 
a loaded n-faced die with probabilities of it landing k equalling S{n,k)/Bn, and then proceed as 
before. 

The differential-recurrence of the Theorem yields to a partial recurrence for the quantity, let's call 
it S^^^{n; Oi, . . . , a^) for the number of multiset set-partitions of 1*" . . . n*" with Oi sets that show 
up once, . . ., ttr sets that show up r times. Using this the computer (all by itself!) can use the 
Wilf Methodology to create a random-generation algorithm. The programming details are a bit 
daunting, so we leave it challenge to the reader. 

The Maple package BABUSHKAS 

Everything here (except for the random-generation, for which we only have the simple r = 1 case) 
is implemented in the Maple package BABUSHKAS available, via a link, from the webpage of this 
article: ^ttp : //www .math. rutgers . edu/'zeilberg/mamarim/mamarimhtml/babushkas .html 



or directly from: tittp: //www. math. rutgers. edu/'zeiiberg/tokhniot /babushkas . That webpage also con- 



tains sample input and output, including the sequences for 1 < r < 8. 

The main procedure is SeqBrn(r,n) that uses the present approach to generate the first n terms 
of the enumerating sequence for the number of ways of reassembling r identical Russian Dolls. 
SeqBrnDJ(r,n) does the same thing using the Devitt- Jackson approach. We are glad to report 
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that they agree! As yet another check, we have the program SSP, that actually constructs the set 
of all multi-set set partitions of any given multiset, and emables checking, for small values, with 
the naive count. Procedure B(L) handles the case of non-identical Russian Dolls, or equivalently, 
an arbitrary multiset, using the Main Theorem. 

SeqCrn and SeqCrnDJ handle set-partitions of the multiset V . . .n'^, in other words, each set can 
only show up once. This is simply ^^^^^(1,0, ... ,0), in the above notation. 

Full details are available on-line by typing ezra ( ) ; . 

The sequences 

Even though much more data is available in the above-mentioned webpage, and these sequences 
will soon be submitted to Sloane, let us cite the first ten terms for r = 1, 2, 3, 4. 

r = 1 : 1, 2, 5, 15, 52, 203, 877, 4140, 21147, 115975 (the Bell Numbers). 

r = 2 : 1, 3, 16, 139, 1750, 29388, 624889, 16255738, 504717929, 18353177160 (the Comtet numbers) 

r = 3 : 1, 4, 39, 862, 35775, 2406208, 238773109, 32867762616, 6009498859909, 1412846181645855 

r = 4 : 1, 5, 81, 4079, 507549, 127126912, 55643064708, 38715666455777, 40095856807088486, 
58901884724160709571. 
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